ABSTRACT

Often there is a desire to evaluate a program, but 
there is no comparable comparison group available. This paper focuses 
on an evaluation model that can be used when there is no comparison 
group and when there is no pretest. The method (Model A) used by the 
Chapter 1 compensatory education program is described. Model A, which 
uses a pretest-posttest approach, is currently applied in most of the 
local educational agencies in the country. The one group posttest 
only design is advocated, which requires content specialists to 
identify which objectives on the posttest were included in the 
compensatory curriculum versus those included only in the regular 
curriculum. The compensatory students should perform better on 
compensatory objectives to which they were exposed in both the 
regular and compensatory programs than on regular curriculum 
objectives to which they were exposed only in the regular curriculum. 
Data from 2 successive years of an evaluation of a Chapter 1 program 
in Dallas (Texas) were analyzed. Data on 20 items were obtained from 
over 2,000 students each year at each grade level. The results were 
mixed, with third-graders performing significantly better on the 
district's criterion-referenced test items included in the 
compensatory curriculum, and second-graders performing better on 
items not included in the curriculum. When data for the two grades 
were combined, the results were in the expected direction. Results 
for the second year suggest that either the grade 2 compensatory 
curriculum or the implementation of that curriculum should be 
reviewed. Four analytical methods are outlined. Two data tables and 
eight figures are included. (TJH) 
Problem: Often there is a desire to evaluate a program, but there 
is no comparable comparison group available. One way to solve the 
problem is to look at the gain from pretest to posttest. (The 
Chapter 1 compensatory education program uses this design, calling 
it Model A; see Horst, Tallmadge, and Wood, 1975.) But if the 
pretest is also unavailable, then the program may have to remain 
unevaluated. The present paper provides an evaluation model that 
can be used when both conditions hold: when there is no comparison 
group and when there is no pretest. 

Model A of Chapter 1: A digression on Model A is necessary at this 
time because that model is currently used in most of the LEAs in 
the country. Two major assumptions of Model A are usually not 
tenable in actual implementation. First, the pretest is often used 
to select students into the Chapter 1 program, thus allowing the 
regression effect to inflate the resulting "Chapter 1 effect." 
Second, the assumption that the regular program is of average 
effectiveness (the equipercentile assumption) is often not valid. 
Since Chapter 1 eligible students cannot be deprived of Chapter 1 
services, a particular LEA cannot know how its Chapter 1 students 
would perform as a result of the regular curriculum only. See 
Figure 1 for a schematic of this assumption, the top of Figure 2 
for three possible Model A results with an effective Chapter 1 
program, and the bottom of Figure 2 for three possible Model A 
results with an ineffective Chapter 1 program. As can be seen in 
Figure 2, Model A can often result in an incorrect conclusion, 
especially when one realizes that very few LEAs implement curricula 
of average effectiveness (for that LEA). 

Procedures: The one group posttest only design avoids these two 
assumptions and can also be used to evaluate a compensatory 
program when there is no comparable comparison group and when 
pretest data do not exist (Ryan, 1980). The design requires content 
specialists to identify which objectives on the posttest were 
included in the compensatory curriculum (the C objectives), and 
which objectives were included only in the regular curriculum (the 
R objectives). Figure 3 provides a schematic representation of a 
20 item test with the R and C designations. The compensatory 
students should perform better on those C objectives to which they 
were exposed in both the regular and the compensatory program (the 
double dosing effect) than on those R objectives to which they were 
exposed only in the regular curriculum. 
Analysis I: One could compare the percent correct on the items 
measuring the two groups of objectives. The analysis would be a 
simple t-test of the difference between two groups — one group being 
the C items and the other group being the R items, as indicated in 
Figure 3, producing a result as in Figure 4. 
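
As an illustration of Analysis I, the following is a minimal Python 
sketch of a pooled-variance t statistic computed over items (the unit 
of analysis). It is not the procedure used in the Dallas evaluation; 
the function name pooled_t and all percent-correct values are 
hypothetical, and a p value would still have to be read from a t table 
or obtained from a statistical package. 

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(c_items, r_items):
    """Independent-samples t statistic (pooled variance) for C vs. R items."""
    n_c, n_r = len(c_items), len(r_items)
    df = n_c + n_r - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_c - 1) * variance(c_items)
                  + (n_r - 1) * variance(r_items)) / df
    std_error = sqrt(pooled_var * (1 / n_c + 1 / n_r))
    return (mean(c_items) - mean(r_items)) / std_error, df


# Hypothetical percent correct among compensatory students, one value per item.
c_pct = [72.0, 75.0, 68.0, 71.0, 74.0]        # items on C objectives
r_pct = [66.0, 70.0, 64.0, 69.0, 67.0, 65.0]  # items on R objectives

t_stat, df = pooled_t(c_pct, r_pct)
print(f"mean C = {mean(c_pct):.1f}, mean R = {mean(r_pct):.1f}, "
      f"t = {t_stat:.2f}, df = {df}")
```

A positive t indicates that the compensatory students did better on 
the C items than on the R items, which is the direction the design 
predicts. 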

Analysis II: It is possible that the items measuring the one group 
of objectives are of different difficulty than the items that are 
measuring the other group of objectives. The solution to this 
potential dilemma is to statistically equate the difficulty of the 
items by covarying the inherent difficulty of the items. One could 
use the difficulty information from any of: 1) the norming sample, 
2) the non-compensatory students in the same school, 3) the results 
from the non-compensatory students in the same school in previous 
years, or 4) the results from one or more LEAs using a similar 
curriculum and similar in demographics. Since the difficulty 
information is used only as a covariate, the adequacy of the 
information is not too crucial. That is, these additional groups 
are only providing information as to the difficulty of items on the 
posttest and the groups are not being used as comparison groups. 
The analysis would be a covariance analysis, covarying the 
difficulty of the items. The covariate is in the last column in 
Figure 3, and would produce a result as in Figure 5. 
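
The covariance adjustment in Analysis II can be sketched with the 
classical one-covariate formulas: a pooled within-group slope, then 
the vertical distance between the two parallel lines. This is only an 
illustration under assumed data; the function name 
adjusted_difference and every value below are hypothetical, and a 
significance test would still require a statistical package. 

```python
from statistics import mean


def adjusted_difference(c_items, r_items):
    """C minus R difference at equal item difficulty (common slope assumed).

    Each item is a (difficulty, percent_correct) pair, where difficulty is
    the percent correct in the group supplying the covariate.
    """
    def sums(items):
        xs, ys = zip(*items)
        mx, my = mean(xs), mean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in items)
        return mx, my, sxx, sxy

    cx, cy, c_sxx, c_sxy = sums(c_items)
    rx, ry, r_sxx, r_sxy = sums(r_items)
    slope = (c_sxy + r_sxy) / (c_sxx + r_sxx)  # pooled within-group slope
    # Raw mean difference, corrected for the groups' difference in difficulty.
    return (cy - ry) - slope * (cx - rx), slope


# Hypothetical data: the C items happen to be easier than the R items.
c_items = [(80, 71.0), (85, 75.5), (90, 80.0), (75, 66.5)]
r_items = [(60, 49.0), (70, 58.0), (65, 53.5), (75, 62.5), (55, 44.5)]

raw_diff = mean(y for _, y in c_items) - mean(y for _, y in r_items)
adj_diff, slope = adjusted_difference(c_items, r_items)
print(f"raw difference = {raw_diff:.2f}, adjusted difference = {adj_diff:.2f}")
```

In these made-up data most of the raw difference comes from the C 
items being easier; the adjusted difference is the part that remains 
once difficulty is equated. 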

Analysis III: If one is concerned that the two lines in Figure 5 
might not be parallel, then that assumption could be tested by 
allowing the two lines to interact, as in Figure 6. If indeed the 
lines were not parallel, then the evaluation would be providing 
valuable information to the curriculum people. The analysis would 
be a linear interaction between the difficulty of the item and the 
type of item. 
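
One way to picture Analysis III is to fit a separate straight line to 
each item type and report the C minus R difference at selected 
difficulty levels. The sketch below does only that estimation step; it 
does not test whether the interaction is significant. The names 
fit_line and difference_at and all values are hypothetical. 

```python
from statistics import mean


def fit_line(items):
    """Least-squares (intercept, slope) for (difficulty, percent_correct) pairs."""
    xs, ys = zip(*items)
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in items)
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope


def difference_at(c_items, r_items, points):
    """C minus R predicted percent correct at each selected difficulty."""
    c0, c1 = fit_line(c_items)
    r0, r1 = fit_line(r_items)
    return {x: (c0 + c1 * x) - (r0 + r1 * x) for x in points}


# Hypothetical data in which the two lines are not parallel.
c_items = [(60, 62.0), (70, 69.0), (80, 76.0), (90, 83.0)]
r_items = [(55, 49.5), (65, 58.5), (75, 67.5), (85, 76.5)]

diffs = difference_at(c_items, r_items, points=[60, 80])
for difficulty, diff in diffs.items():
    print(f"difficulty {difficulty}: C - R = {diff:.1f}")
```

Here the advantage on C items is larger for harder items (lower 
percent correct in the reference group), which is the kind of pattern 
that would be passed on to the curriculum people. 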

Analysis IV: If one is concerned about the assumption of straight 
lines, then that assumption could be tested by allowing the lines 
to be curved, as in Figure 7. If indeed the lines were curved, then 
the evaluation would be providing valuable information to the 
curriculum people. The analysis would be a curvilinear interaction 
between the difficulty of the item and the type of item. 
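
A rough sketch of the curve fitting behind Analysis IV follows, 
assuming a second-degree polynomial for each item type (the paper 
does not specify the form of the curve). Difficulty and performance 
are expressed as proportions here; the helper names and all values 
are hypothetical, and the sketch estimates the two curves without 
testing the curvilinear interaction. 

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def fit_quadratic(items):
    """Least-squares (b0, b1, b2) for y = b0 + b1*x + b2*x**2."""
    rows = [[1.0, x, x * x] for x, _ in items]
    ys = [y for _, y in items]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(xtx, xty)


def curve(coefs, x):
    b0, b1, b2 = coefs
    return b0 + b1 * x + b2 * x * x


# Hypothetical (difficulty, proportion correct) pairs for each item type.
c_items = [(0.5, 0.45), (0.6, 0.544), (0.7, 0.646), (0.8, 0.756), (0.9, 0.874)]
r_items = [(0.5, 0.40), (0.6, 0.48), (0.7, 0.56), (0.8, 0.64), (0.9, 0.72)]

c_coefs = fit_quadratic(c_items)
r_coefs = fit_quadratic(r_items)
gap_low = curve(c_coefs, 0.5) - curve(r_coefs, 0.5)
gap_high = curve(c_coefs, 0.9) - curve(r_coefs, 0.9)
print(f"C - R at difficulty .5: {gap_low:.3f}; at .9: {gap_high:.3f}")
```

Plotting the two fitted curves over the observed range of difficulty 
would give a picture of the kind shown schematically in Figure 7. 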

Interpretations: If one performed analysis I, then the mean 
difference between the two groups of items would be reported. 
Analysis II would result in the difference between the two lines 
being reported. Analysis III would call for the reporting of the 
difference at selected points along the interacting lines of best 
fit, whereas analysis IV would call for depicting the two curves 
of best fit. (See any general linear models text, such as McNeil, 
Kelly, and McNeil (1975) for statistical and reporting procedures.) 

Data collected over a period of years at either the school 
level or at the LEA level could result in patterns such as those 
in Figure 8. Notice that all interpretations are strictly with 
regard to the Chapter 1 program and are irrespective of the 
effectiveness of the regular program. If Model A had been used with 
these data, different (and erroneous) conclusions would have been 
obtained. If the regular program in each of these six LEAs was 
[illegible] considered to be effective [illegible]. 

Special concerns: The design rests heavily on the accuracy of the 
identification of the objectives that were included in the two 
curricula. The task can be made a little easier by using a 
criterion-referenced test that has been designed to measure the 
regular curriculum. In such a case the content people only have to 
identify those objectives that are in the compensatory curriculum. 

In most school systems there is the additional assumption that 
the teachers actually taught the curriculum (and that the students 
listened to and learned from the curricula). The extent to which 
these assumptions are tenable causes problems for Model A as well. 
[illegible] of obtaining significant results is in favor of the 
compensatory program in the one group posttest only design. 

An applied example: Data from two successive years of an evaluation 
of a Chapter 1 program in Dallas, Texas will now be presented. The 
district's criterion-referenced test (STEELS) [illegible] was 
routinely administered as a posttest to both Chapter 1 and 
non-Chapter 1 students. The identification of which objectives were 
in the Chapter 1 [illegible] compensatory curriculum was 
accomplished easily by the curriculum specialists. The inherent 
difficulty level of [illegible] was determined from the 
non-compensatory students in the district, as those data were 
readily available. 

Results: Table 1 contains the results for the first year of 
[illegible] evaluation design (McNeil, Berry, and Metze, 1988). 
When considering all three grades together, A Priori students did 
significantly better on items taught in the A Priori program. When 
the results were examined at each grade, they were always in favor 
of the A Priori items, but significance was obtained only at 
grade 1. The small number of items (the unit of analysis) at each 
grade level hampered the attainment of significance. 

Table 2 contains the results for the second year of 
implementation of the new evaluation model (McNeil, Jones, 
[illegible]). [illegible] grade 1 students did not take the STEELS, 
so data were available for only grades 2 and 3. The results were 
mixed, with third-grade students performing significantly better on 
the items included in the compensatory curriculum [illegible] 
actually obtained from over 2000 students [illegible] check on the 
[illegible]. Although process evaluations [illegible]. 

Implementation. With any new evaluation model, ease in 
implementation is a reasonable concern. Analysis I is a 
straightforward [illegible]. Analyses II, III, and IV require an 
evaluator who understands covariance, interaction, and curvilinear 
[illegible]. For those who understand these concepts, the 
interpretive value of these analyses far outweighs the [illegible]. 
Existing computer packages such as SAS and SPSS can easily perform 
the calculations. 

Aggregation of data. State and Federal evaluators want the data 
to be compatible across LEAs. If the data are transformed to 
[illegible], a straightforward procedure, the results should be 
aggregatable. 

Interpretation of results. The interpretation of results will 
have to [illegible], as did the NCE metric when it was [illegible]. 
[illegible] provide insights into curriculum, inservice and 
[illegible]. 

Curriculum fit determinations [illegible] rather than by 
evaluators. The task can be difficult and [illegible]. [illegible] 
might argue that the content specialists should know both the 
regular and compensatory curricula well enough so that the task 
would not be that difficult, [illegible] the case in the one 
application. In addition, such determinations [illegible] a test 
adoption decision [illegible] crucial for [illegible] adoption 
decision [illegible] the compensatory program. Those items that 
are not in the regular curriculum or in the compensatory curriculum 
can be [illegible] analysis, which is not possible in the Model A 
[illegible]. 

Teacher implementation of curriculum. If the Chapter 1 teachers 
do not implement the Chapter 1 program as expected, then the 
analysis will wrongly accuse the Chapter 1 program of being not 
effective. [illegible] of Chapter 1 teachers could avoid this 
conclusion. 

[illegible] in curriculum. A Chapter 1 curriculum might focus on 
low-level objectives, but most tests are [illegible] measured by 
items of varying difficulty. If indeed the Chapter 1 curriculum is 
measured only by items of low difficulty, then analysis I will lead 
to an incorrect conclusion, but analyses II, III, and IV will still 
be applicable. 

Testing out of level. Many compensatory students take a lower 
level test, as recommended by the developers of the Chapter 1 
evaluation models (Roberts, 1981). Since the same kind of curriculum 
fit determinations can be made with an out of level test as with 
an on level test, testing out of level would not cause a problem 
with the new evaluation model. 

Summary: An evaluator may on occasion be confronted with the need 
to produce an evaluation of a compensatory program when there is 
no available comparison group and when no pretest data are 
available. The design discussed in this paper provides a tool for 
obtaining an evaluation under such constraining circumstances, 
without sacrificing any evaluation principles. 

The design is particularly valuable for two reasons. First, 
few, if any, evaluators ever find a perfect comparison group in the 
real world. In this design, the students serve as their own control. 
Second, if program gains are evaluated over a school year, which 
they usually are, it may be inappropriate to use the same test for 
both pretest and posttest. It may be very difficult to identify a 
test which adequately measures the objectives desired at the 
posttest and which can be administered at pretest. 



NOTE: I would like to thank Joe Ryan for initially discussing this 
design, and Napoleon Mitchell, Gail Smith, Wayne Murray, William 
Denton, George Powell, James English, and David Vines for forcing 
me to have a better conceptualization of the design. I especially 
want to thank Barbara Mathews, Jane Seibert, and Rosie Ramirez for 
identifying the items and helping me chart the unknown. 

Figure 1. How the NCE gain is calculated in Model A. 

Figure 2. Six possible Model A results, some of which are 
misleading. 

1    Y    Y 
2    Y    Y 
3    Y    Y 
4    Y    N 
5    Y    N 
6    Y    N 
7    N    N 
8    N    Y 
... 
20 

Figure 3. Sample design. 

R .68 .78 

Figure 4. Schematic results from analysis I, two group means. 

Figure 5. Schematic results from analysis II, inherent difficulty 
as covariate. 

Figure 6. Schematic results from analysis III, linear interaction. 

Figure 7. Schematic results from analysis IV, curvilinear 
interaction. 

LEA    86-87    87-88    88-89    89-90 
#1         5        5        5        5 
#2         5        5       10       10 
#3         2        2        2        2 
#4        10       10       10       10 
#5         5        5        1        1 
#6        -2       -2       -2       -2 


The Chapter 1 program in LEA #1 had a positive and consistent 
effect over the four years. 

The Chapter 1 program in LEA #2 had a positive effect, more so 
after the first two years. 

The Chapter 1 program in LEA #3 had a consistent low positive 
effect in each of the four years. 

The Chapter 1 program in LEA #4 had high positive effects in each 
of the last four years. 

The Chapter 1 program in LEA #5 had a positive effect the first 
two years, but something happened in the last two years to 
eliminate the effect. 

The Chapter 1 program in LEA #6 has had a consistent negative 
effect over the last four years. 



Figure 8. Possible patterns of results from the one group posttest 
only design, along with possible interpretations. 

Table 1. Percent Correct on STEELS Language Arts Items Included and 
Not Included in the A Priori Curriculum, 1987-88. 

         Items Included       Items Not Included 
         in A Priori          in A Priori 
         Percent              Percent              Probability 
Grade    Correct      N       Correct      N       of Difference 
1        70.1        13       66.9        20       .009 
2        73.3        18       71.9        23       .205 
3        66.9        16       65.1        21       .124 
All      70.1        47       68.2        64       .002 

Note. Items were adjusted for overall difficulty. 


Table 2. Percent Correct on STEELS Language Arts Items Included and 
Not Included in the A Priori Curriculum, 1988-89. 

         Items Included       Items Not Included 
         in A Priori          in A Priori 
         Percent              Percent              Probability 
Grade    Correct      N       Correct      N       of Difference 
2        70.0        18       72.4        23       .72 
3        70.8        16       64.5        21       .04 
All      70.4        34       68.3        44       .12 

Note. Items were adjusted for overall difficulty. 